Data Report — Heart Disease

Source: UCI dataset 45

Dataset metadata

Description

Four databases: Cleveland, Hungary, Switzerland, and VA Long Beach.

Variables and summary

| variable | description | inferred | declared | dist |
| --- | --- | --- | --- | --- |
| age | | continuous | Integer | 54.5421 ± 9.0497 [29, 48, 56, 61, 77] |
| sex | | discrete | Categorical | 201 (67.68%) |
| cp | | discrete | Categorical | 4: 142 (47.81%); 3: 83 (27.95%); 2: 49 (16.50%); 1: 23 (7.74%) |
| trestbps | resting blood pressure (on admission to the hospital) | continuous | Integer | 131.6936 ± 17.7628 [94, 120, 130, 140, 200] |
| chol | serum cholesterol | continuous | Integer | 247.3502 ± 51.9976 [126, 211, 243, 276, 564] |
| fbs | fasting blood sugar > 120 mg/dl | discrete | Categorical | 43 (14.48%) |
| restecg | | discrete | Categorical | 0: 147 (49.49%); 2: 146 (49.16%); 1: 4 (1.35%) |
| thalach | maximum heart rate achieved | continuous | Integer | 149.5993 ± 22.9416 [71, 133, 153, 166, 202] |
| exang | exercise induced angina | discrete | Categorical | 97 (32.66%) |
| oldpeak | ST depression induced by exercise relative to rest | continuous | Integer | 1.0556 ± 1.1661 [0, 0, 0.8, 1.6, 6.2] |
| slope | | discrete | Categorical | 1: 139 (46.80%); 2: 137 (46.13%); 3: 21 (7.07%) |
| ca | number of major vessels (0-3) colored by fluoroscopy | continuous | Integer | 0.6768 ± 0.9390 [0, 0, 0, 1, 3] |
| thal | | continuous | Categorical | 4.7306 ± 1.9386 [3, 3, 3, 7, 7] |
| num | diagnosis of heart disease | discrete | Integer | 0: 160 (53.87%); 1: 54 (18.18%); 2: 35 (11.78%); 3: 35 (11.78%); 4: 13 (4.38%) |
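
The dist entries above appear to follow two patterns: mean ± standard deviation followed by [min, Q1, median, Q3, max] for continuous variables, and count (percent) per level for discrete ones. The sketch below shows one way such summaries could be computed with the Python standard library. The function names are hypothetical, and the choice of sample standard deviation and inclusive quartile interpolation is an assumption; the report does not say which conventions it uses.

```python
import statistics
from collections import Counter


def summarize_continuous(values):
    """Return (mean, sd, [min, Q1, median, Q3, max]) for a numeric column.

    Assumptions: sample standard deviation (n - 1 denominator) and
    inclusive (linear-interpolation) quartiles.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (
        statistics.mean(values),
        statistics.stdev(values),
        [min(values), q1, median, q3, max(values)],
    )


def summarize_categorical(values):
    """Return [(level, count, percent)] sorted by descending count."""
    n = len(values)
    return [
        (level, count, round(100 * count / n, 2))
        for level, count in Counter(values).most_common()
    ]


# Rebuild the cp column from the counts listed in the table (297 rows in total).
cp = [4] * 142 + [3] * 83 + [2] * 49 + [1] * 23
cp_summary = summarize_categorical(cp)
```

Applied to the cp counts from the table, the percentages come out as 47.81, 27.95, 16.5 and 7.74, which matches the reported values and suggests the percentages are taken over 297 rows.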

Fidelity summary

| model | backend | disc_jsd_mean | disc_jsd_median | cont_ks_mean | cont_w1_mean |
| --- | --- | --- | --- | --- | --- |
| MetaSyn | metasyn | 0.1039 | 0.098 | 0.2835 | 2.5619 |
| clg_mi2 | pybnesian | 0.1003 | 0.0995 | 0.2344 | 4.4109 |
| semi_mi5 | pybnesian | 0.1003 | 0.0995 | 0.2344 | 4.4109 |
| ctgan_fast | synthcity | 0.4269 | 0.4085 | 0.686 | 30.8935 |
| tvae_quick | synthcity | 0.1021 | 0.1128 | 0.2007 | 6.072 |
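
The column names suggest three per-variable distances between real and synthetic data, aggregated across variables: Jensen-Shannon (disc_jsd) for discrete variables, and the two-sample Kolmogorov-Smirnov statistic (cont_ks) and Wasserstein-1 distance (cont_w1) for continuous ones. The following is a minimal standard-library sketch of those three quantities, not the report's actual implementation. The function names are hypothetical, and it is an assumption that jsd means the base-2 Jensen-Shannon distance (square root of the divergence) rather than the divergence itself.

```python
import math
from bisect import bisect_right
from collections import Counter


def js_distance(real, synth):
    """Jensen-Shannon distance (base 2, range [0, 1]) between two categorical samples."""
    p_counts, q_counts = Counter(real), Counter(synth)
    divergence = 0.0
    for level in set(p_counts) | set(q_counts):
        p = p_counts[level] / len(real)
        q = q_counts[level] / len(synth)
        m = (p + q) / 2
        if p > 0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0:
            divergence += 0.5 * q * math.log2(q / m)
    return math.sqrt(max(divergence, 0.0))


def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(real), sorted(synth)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def wasserstein_1(real, synth):
    """Wasserstein-1 distance: area between the two empirical CDFs."""
    a, b = sorted(real), sorted(synth)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        gap = abs(bisect_right(a, left) / len(a) - bisect_right(b, left) / len(b))
        total += gap * (right - left)
    return total
```

Unlike the Jensen-Shannon distance and the KS statistic, which are bounded by 1, the Wasserstein-1 distance is expressed in the units of the variable. If cont_w1_mean is an unweighted mean over unscaled columns, wide-ranging variables such as chol would dominate it.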

Models


Real data

MetaSyn

Model: clg_mi2 (pybnesian)

Model: semi_mi5 (pybnesian)

Model: ctgan_fast (synthcity)

Model: tvae_quick (synthcity)